Improving Temporal Difference Learning Performance in Backgammon Variants

Authors

  • Nikolaos Papahristou
  • Ioannis Refanidis
Abstract

Palamedes is an ongoing project for building expert-level bots that can play backgammon variants. As in all successful modern backgammon programs, it is based on neural networks trained using temporal difference learning. This paper improves upon the training method we used in our previous approach for the two backgammon variants popular in Greece and neighboring countries, Plakoto and Fevga. We show that the proposed methods result in both faster learning and better performance. We also present insights into the selection of features in our experiments that can be useful for temporal difference learning in other games as well.
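As a rough illustration of the training approach the abstract refers to, the sketch below shows one TD(λ) weight update with eligibility traces for a linear value function. The function name, parameter values, and linear form are illustrative assumptions, not details taken from Palamedes (which uses neural networks).

```python
# Illustrative TD(lambda) update with eligibility traces for a linear
# value function V(s) = w . phi(s). All names and defaults here are
# illustrative, not taken from the paper.

def td_lambda_update(weights, traces, features, next_features,
                     reward, alpha=0.1, gamma=1.0, lam=0.7):
    """Perform one TD(lambda) step; returns updated (weights, traces)."""
    v = sum(w * f for w, f in zip(weights, features))
    v_next = sum(w * f for w, f in zip(weights, next_features))
    delta = reward + gamma * v_next - v          # TD error
    # Decay old traces and accumulate the current gradient,
    # which for a linear value function is just the feature vector.
    traces = [gamma * lam * e + f for e, f in zip(traces, features)]
    weights = [w + alpha * delta * e for w, e in zip(weights, traces)]
    return weights, traces
```

In a game like backgammon, `features` would encode the board position after each move, and `reward` is typically zero until the final game outcome.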

Similar References

Computationally Intensive and Noisy Tasks: Co-Evolutionary Learning and Temporal Difference Learning on Backgammon

The most difficult but realistic learning tasks are both noisy and computationally intensive. This paper investigates how, for a given solution representation, coevolutionary learning can achieve the highest ability from the least computation time. Using a population of Backgammon strategies, this paper examines ways to make computational costs reasonable. With the same simple architecture Gera...


Why Co-Evolution beats Temporal Difference learning at Backgammon for a linear architecture, but not a non-linear architecture

The No Free Lunch theorems show that the algorithm must suit the problem. This does not answer the novice’s question: for a given problem, which algorithm to use? This paper compares co-evolutionary learning and temporal difference learning on the game of Backgammon, which (like many real-world tasks) has an element of random uncertainty. Unfortunately, to fully evaluate a single strategy using...


TDLeaf(λ): Combining Temporal Difference Learning with Game-Tree Search

In this paper we present TDLeaf(λ), a variation on the TD(λ) algorithm that enables it to be used in conjunction with minimax search. We present some experiments in both chess and backgammon which demonstrate its utility and provide comparisons with TD(λ) and another less radical variant, TD-directed(λ). In particular, our chess program, " KnightCap, " used TDLeaf(λ) to learn its evaluation fun...
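The core idea described above, updating toward the value of the principal leaf of a minimax search rather than the raw successor position, can be sketched as follows. This is a hypothetical illustration; the function names and the toy game interface (`moves`, `evaluate`) are assumptions, not the paper's implementation.

```python
# Hypothetical sketch of the TDLeaf idea: find the principal-variation
# leaf of a minimax search, so a TD update can be applied to the
# evaluation at that leaf instead of at the immediate successor state.

def minimax_leaf(state, depth, evaluate, moves, maximizing=True):
    """Return (value, leaf_state) of the principal variation."""
    children = moves(state)
    if depth == 0 or not children:
        return evaluate(state), state
    results = [minimax_leaf(c, depth - 1, evaluate, moves, not maximizing)
               for c in children]
    # Pick the child whose subtree value is best for the side to move.
    return (max if maximizing else min)(results, key=lambda r: r[0])
```

In TDLeaf(λ), the returned `leaf_state` (not the root's direct successor) is the position whose evaluation gradient drives the weight update.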


Self-Play and Using an Expert to Learn to Play Backgammon with Temporal Difference Learning

A promising approach to learn to play board games is to use reinforcement learning algorithms that can learn a game position evaluation function. In this paper we examine and compare three different methods for generating training games: 1) Learning by self-play, 2) Learning by playing against an expert program, and 3) Learning from viewing experts play against each other. Although the third po...




Journal title:

Volume   Issue

Pages  -

Publication date: 2011